GWU NLP at SemEval-2016 Shared Task 1: Matrix Factorization for Crosslingual STS

نویسندگان

  • Hanan Aldarmaki
  • Mona T. Diab
چکیده

We present a matrix factorization model for learning cross-lingual representations for sentences. Using sentence-aligned corpora, the proposed model learns distributed representations by factoring the given data into language-dependent factors and one shared factor. As a result, input sentences from both languages can be mapped into fixed-length vectors and then compared directly using the cosine similarity measure, which achieves 0.8 Pearson correlation on Spanish-English semantic textual similarity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation

Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with on...

متن کامل

Samsung Poland NLP Team at SemEval-2016 Task 1: Necessity for diversity; combining recursive autoencoders, WordNet and ensemble methods to measure semantic similarity

This paper describes our proposed solutions designed for a STS core track within the SemEval 2016 English Semantic Textual Similarity (STS) task. Our method of similarity detection combines recursive autoencoders with a WordNet award-penalty system that accounts for semantic relatedness, and an SVM classifier, which produces the final score from similarity matrices. This solution is further sup...

متن کامل

DalGTM at SemEval-2016 Task 1: Importance-Aware Compositional Approach to Short Text Similarity

This paper describes our system submission to the SemEval 2016 English Semantic Textual Similarity (STS) shared task. The proposed system is based on the compositional text similarity model, which aggregates pairwise word similarities for computing the semantic similarity between texts. In addition, our system combines word importance and word similarity to build an importance-similarity matrix...

متن کامل

IHS-RD-Belarus at SemEval-2016 Task 1: Multistage Approach for Measuring Semantic Similarity

This paper describes the system for rating the degree of semantic equivalence between two text snippets developed by IHS-RD-Belarus for the SemEval 2016 STS shared task (Task 1). To predict the human ratings of text similarity we use a support vector regression model with multiple features representing similarity and difference scores calculated for each

متن کامل

NORMAS at SemEval-2016 Task 1: SEMSIM: A Multi-Feature Approach to Semantic Text Similarity

This paper presents the submission of our team (NORMAS) to the SemEval 2016 semantic textual similarity (STS) shared task. We submitted three system runs, each using a set of 36 features extracted from the training set. The runs explore the use of the following three machine learning algorithms: Support Vector Regression, Elastic Net and Random Forest. Each run was trained using sentence pairs ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016